Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs

Neural Information Processing Systems

Learning useful representations is a key ingredient to the success of modern machine learning. Currently, representation learning mostly relies on embedding data into Euclidean space. However, recent work has shown that data in some domains is better modeled by non-Euclidean metric spaces, and inappropriate geometry can result in inferior performance. In this paper, we aim to eliminate the inductive bias imposed by the embedding space geometry. Namely, we propose to map data into a more general non-vector metric space: a weighted graph with a shortest-path distance. By design, such graphs can model arbitrary geometry with a proper configuration of edges and weights. Our main contribution is PRODIGE: a method that learns a weighted graph representation of data end-to-end by gradient descent. Greater generality and fewer model assumptions make PRODIGE more powerful than existing embedding-based approaches. We confirm the superiority of our method via extensive experiments on a wide range of tasks, including classification, compression, and collaborative filtering.
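The paper's training details are not reproduced on this page. As a rough illustration of the idea, the sketch below (PyTorch, hypothetical variable names, not the authors' implementation) learns Bernoulli edge-presence probabilities and positive edge weights so that shortest-path distances on sampled graphs match a target metric; edge sampling uses a straight-through estimator, and the actual PRODIGE procedure may differ.

```python
# Hypothetical sketch: edge presence ~ Bernoulli(sigmoid(logits)),
# positive edge weights, and a Floyd-Warshall relaxation whose min()
# is subdifferentiable. Illustrative only, not the reference code.
import torch

n = 16                                             # toy number of nodes
eye = torch.eye(n)
logits = torch.zeros(n, n, requires_grad=True)     # edge-presence logits
log_w = torch.zeros(n, n, requires_grad=True)      # log edge weights

def sample_graph():
    probs = torch.sigmoid(logits)
    hard = torch.bernoulli(probs.detach())         # non-differentiable sample
    mask = hard + probs - probs.detach()           # straight-through gradient
    weights = torch.exp(log_w)                     # keep weights positive
    return mask * weights + (1.0 - mask) * 1e6     # absent edge = huge cost

def shortest_paths(adj):
    d = adj * (1.0 - eye)                          # zero self-distances
    for k in range(n):                             # Floyd-Warshall relaxation
        d = torch.minimum(d, d[:, k:k + 1] + d[k:k + 1, :])
    return d

target = torch.rand(n, n)
target = (target + target.T) * (1.0 - eye) / 2     # toy symmetric metric
opt = torch.optim.Adam([logits, log_w], lr=1e-2)
for step in range(200):
    dist = shortest_paths(sample_graph())
    loss = ((dist - target) ** 2).mean()           # fit pairwise distances
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because a shortest path is a sum of edge weights, gradients reach the weights on the chosen path directly; only the discrete edge-presence variables need the relaxation.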


Reviews: Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs

Neural Information Processing Systems

Except that the presence of each edge is probabilistic rather than deterministic, the core idea is quite similar to Isomap. The novelty should be better addressed by comparing against Isomap. There is also the question of edge independence: for example, edges between words that frequently co-occur in the same contexts are not independent of each other, and edges between pixels in small coherent regions are not independent. Do we eventually need to know such dependency structures a priori to correctly represent arbitrary geometry in the data?
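For readers weighing this comparison: Isomap also relies on shortest-path (geodesic) distances, but on a fixed k-nearest-neighbor graph rather than a learned one. A standard scikit-learn/SciPy sketch of that construction (toy data, not tied to this paper):

```python
# Isomap-style geodesic distances on a FIXED k-NN graph, for contrast
# with PRODIGE, which learns the graph itself. Standard library calls.
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

X = np.random.rand(100, 5)                                 # toy data points
knn = kneighbors_graph(X, n_neighbors=10, mode="distance")
geodesic = shortest_path(knn, method="D", directed=False)  # Dijkstra
# Classical Isomap then embeds `geodesic` with MDS; the graph
# structure itself is never optimized.
```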


Reviews: Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs

Neural Information Processing Systems

The paper proposes a quite interesting idea of representing data by weighted graphs (with shortest-path distances between nodes). Reviewers have raised concerns about edge dependence and the given similarity metric. However, I'm less worried about the independence assumption because, after all, it's a model, and it seems to work well in experiments. Likewise, it is common in variational inference to use an independent (fully factorized) distribution to approximate a graphical model, based on which learning is carried out. What interests me more is the general methodology of optimization.
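For context, the independence approximation the meta-reviewer alludes to is the standard mean-field factorization from variational inference (general background, not specific to this paper):

$$ q(z_1, \dots, z_n) = \prod_{i=1}^{n} q_i(z_i), \qquad q^\star = \arg\min_{q} \, \mathrm{KL}\!\left( q(z) \,\middle\|\, p(z \mid x) \right), $$

that is, a fully factorized q is fit to a joint posterior that may itself have strong dependencies, and this mismatch is tolerated because it makes learning tractable.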

